Add Wheel & Docs: Flash Attention 2.7.4.post1 for Py3.12 / CUDA 12.1 / PyTorch 2.5.1 #4
This PR adds support for Python 3.12 users by providing a pre-compiled wheel for `flash-attn` v2.7.4.post1 and updating the documentation accordingly.

Wheel Details:

- Package: `flash-attn` 2.7.4.post1
- Python: 3.12 (`cp312`)
- Built against: PyTorch 2.5.1+cu121 (CUDA 12.1)
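For anyone installing this wheel, a minimal sanity-check sketch like the one below (not part of this PR; the assertions simply restate the build targets listed above) can confirm the local environment matches before importing `flash_attn`:

```python
# Hypothetical pre-flight check: verifies the interpreter, PyTorch, and CUDA
# versions match the targets this wheel was built for.
import sys

import torch

assert sys.version_info[:2] == (3, 12), "wheel targets Python 3.12 (cp312)"
assert torch.__version__.startswith("2.5.1"), "wheel targets PyTorch 2.5.1"
assert torch.version.cuda == "12.1", "wheel targets CUDA 12.1"

# After `pip install flash_attn-2.7.4.post1-cp312-cp312-win_amd64.whl`,
# the import below should succeed and report version 2.7.4.post1.
import flash_attn
print("flash-attn", flash_attn.__version__)
```

If any assertion fails, the pre-built wheel will not match the environment and building from source (or using a different wheel) would be needed instead.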
Wheel File Location:

The `.whl` file (`flash_attn-2.7.4.post1-cp312-cp312-win_amd64.whl`) has been uploaded as a binary asset to the release tagged `v2.7.4.post1_py312_cu121_torch251` on my fork (`Wisdawn/flash-attention-windows`). It is not included directly in this PR's file changes due to GitHub's file size limits.

README.md Updates in this PR:
Hoping this contribution helps other Windows users! Let me know if any changes are needed.